Why are most empirical networks, with the prominent exception of social ones, generically degree- 
degree anticorrelated, i.e. disassortative? With a view to answering this long-standing question, 
we define a general class of degree-degree correlated networks and obtain the associated Shannon 
entropy as a function of parameters. It turns out that the maximum entropy does not typically 
correspond to uncorrelated networks, but to either assortative (correlated) or disassortative (an- 
ticorrelated) ones. More specifically, for highly heterogeneous (scale-free) networks, the maximum 
entropy principle usually leads to disassortativity, providing a parsimonious explanation to the ques- 
tion above. Furthermore, by comparing the correlations measured in some real-world networks with 
those yielding maximum entropy for the same degree sequence, we find a remarkable agreement in 
various cases. Our approach provides a neutral model from which, in the absence of further knowl- 
edge regarding network evolution, one can obtain the expected value of correlations. In cases in 
which empirical observations deviate from the neutral predictions - as happens in social networks 
- one can then infer that there are specific correlating mechanisms at work. 

Complex networks, whether natural or artificial, have 
non-trivial topologies which are usually studied by 
analysing a variety of measures, such as the degree dis- 
tribution, clustering, average paths, modularity, etc. [H- 
0] The mechanisms which lead to a particular structure 
and their relation to functional constraints are often not 
clear and constitute the subject of much debate 0, Q. 
When nodes are endowed with some additional "prop- 
erty," a feature known as mixing or assortativity can 
arise, whereby edges are not placed between nodes com- 
pletely at random, but depending in some way on the 
property in question. If similar (dissimilar) nodes tend 
to wire together, the network is said to be assortative 
(disassortative) [4]. 

An interesting situation is when the property taken 
into account is the degree of each node - i.e., the num- 
ber of neighboring nodes connected to it. It turns out 
that a high proportion of empirical networks - whether 
biological, technological, information-related or linguis- 
tic - are disassortatively arranged (high-degree nodes, or 
hubs, are preferentially linked to low-degree neighbors, 
and vice versa) while social networks are usually 
assortative. Such degree-degree correlations have important 
consequences for network characteristics such as connect- 
edness and robustness f3|. 

However, while assortativity in social networks can be 
explained taking into account homophily [3j or modular- 
ity [3| , the widespread prevalence and extent of disassor- 
tative mixing in most other networks remains somewhat 
mysterious. Maslov et al. found that the restriction of 
having at most one edge per pair of nodes induces some 
disassortative correlations in heterogeneous networks [6], 
and Park and Newman showed how this analogue of 
the Pauli exclusion principle leads to the edges following 
Fermi statistics (see also Q). However, this restriction 
is not sufficient to fully account for empirical data. 
In general, when one attempts to consider computation- 
ally all the networks with the same distribution as a given 
empirical one, the mean assortativity is not necessarily 
zero ^ . But since some "randomization" mechanisms induce 
positive correlations and others negative ones px|, 
it is not clear how the phase space can be properly sam- 
pled numerically. 

In this letter, we show that there is a general reason, 
consistent with empirical data, for the "natural" mix- 
ing of most networks to be disassortative. Using an 
information-theory approach we find that the configu- 
ration which can be expected to come about in the ab- 
sence of specific additional constraints turns out not to 
be, in general, uncorrelated. In fact, for highly hetero- 
geneous degree distributions such as those of the ubiq- 
uitous scale-free networks, we show that the expected 
value of the mixing is usually disassortative: there are 
simply more possible disassortative configurations than 
assortative ones. This result provides a simple topolog- 
ical answer to a long-standing question. Let us caution 
that this does not imply that all scale-free networks are 
disassortative, but only that, in the absence of further 
information on the mechanisms behind their evolution, 
this is the neutral expectation. 

The topology of a network is entirely described by its 
adjacency matrix $a$; the element $a_{ij}$ represents the number 
of edges linking node $i$ to node $j$ (for undirected networks, 
$a$ is symmetric). Among all the possible microscopically 
distinguishable configurations a set of $L$ edges 
can adopt when distributed among $N$ nodes, it is often 
convenient to consider the set of configurations which 
have certain features in common - typically some macroscopic 
magnitude, like the degree distribution. Such a 
set of configurations defines an ensemble. In a seminal 
series of papers Bianconi has determined the partition 
functions of various ensembles of random networks and 
derived their statistical-mechanics entropy [11]. This allows 
the author to estimate the probability that a random 
network with certain constraints has of belonging to 
a particular ensemble, and thus assess the relative impor- 
tance of different magnitudes and help discern the mech- 
anisms responsible for a given real-world network. For 
instance, she shows that scale-free networks arise natu- 
rally when the total entropy is restricted to a small finite 
value. Here we take a similar approach: we obtain the 
Shannon information entropy encoded in the distribution 
of edges. As we shall see, both methods yield the same 
results jl^] , but for our purposes the Shannon entropy is 
more tractable. 

The Shannon entropy associated with a probability distribution 
$p_m$ is $s = -\sum_m p_m \ln(p_m)$, where the sum extends 
over all possible outcomes $m$. For a given pair of nodes 
$(i,j)$, $p_m$ can be considered to represent the 
probability of there being $m$ edges between $i$ and $j$. For 
simplicity, we shall focus here on networks such that $a_{ij}$ 
can only take values 0 or 1, although the method is 
applicable to any number of edges allowed. In this case, 
we have only two terms: $p_1 = \epsilon_{ij}$ and $p_0 = 1 - \epsilon_{ij}$, 
where $\epsilon_{ij} = E(a_{ij})$ is the expected value of the element 
$a_{ij}$ given that the network belongs to the ensemble of 
interest. The entropy associated with pair $(i,j)$ is then 
$S_{ij} = -[\epsilon_{ij}\ln(\epsilon_{ij}) + (1 - \epsilon_{ij})\ln(1 - \epsilon_{ij})]$, while the total 
entropy of the network is $S = \sum_{ij}^{N} S_{ij}$: 

$$S = -\sum_{ij}^{N}\left[\epsilon_{ij}\ln(\epsilon_{ij}) + (1 - \epsilon_{ij})\ln(1 - \epsilon_{ij})\right]. \qquad (1)$$

Since we have not imposed symmetry of the adjacency 
matrix, this expression is in general valid for directed 
networks. For undirected networks, however, the sum is 
only over i < j, with the consequent reduction in entropy. 
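
As a minimal sketch of Eq. (1), the total entropy can be evaluated numerically from any matrix of expected values. The function name `shannon_entropy` and the `directed` flag are illustrative choices, not part of the original formulation; the sum runs over all entries in the directed case and over pairs i < j in the undirected one.

```python
import numpy as np

def shannon_entropy(eps, directed=True):
    """Eq. (1): S = -sum_ij [eps ln(eps) + (1 - eps) ln(1 - eps)]."""
    eps = np.asarray(eps, dtype=float)
    if not directed:
        # undirected network: count each pair once (i < j)
        eps = eps[np.triu_indices_from(eps, k=1)]
    # x ln(x) -> 0 as x -> 0, so entries equal to 0 or 1 contribute nothing
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(
            (eps > 0) & (eps < 1),
            eps * np.log(eps) + (1 - eps) * np.log1p(-eps),
            0.0,
        )
    return -terms.sum()
```

A fully determined matrix (all entries 0 or 1) gives zero entropy, while entries of 1/2 contribute ln 2 each, the largest value a single pair can carry.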

For the sake of illustration, we shall estimate the entropy 
of the Internet at the autonomous system (AS) 
level and compare it with the values obtained in [11], 
assuming the network belongs to two different ensembles: 
the fully random graph, or Erdős-Rényi (ER) ensemble, 
and the configuration ensemble with a scale-free 
degree distribution ($p(k) \sim k^{-\gamma}$) and structural 
cutoff, $k_i < \sqrt{\langle k\rangle N}$, $\forall i$ JJj] ($\langle k\rangle$ is the mean degree). 
In this example, we assume the network to be 
sparse enough to expand the term $\ln(1 - \epsilon_{ij})$ in Eq. (1) 
and keep only linear terms. This reduces Eq. (1) 
to $S_{sparse} \simeq -\sum_{ij}^{N} \epsilon_{ij}[\ln(\epsilon_{ij}) - 1] + O(\epsilon_{ij}^2)$. In the ER 
ensemble, each of $N$ nodes has an equal probability of 
receiving each of $\frac{1}{2}\langle k\rangle N$ undirected edges. So, writing 
$\epsilon^{ER}_{ij} = \langle k\rangle/N$, we have $S_{ER} = -\frac{1}{2}\langle k\rangle N[\ln(\langle k\rangle/N) - 1]$. 
The configuration ensemble, which imposes a given degree 
sequence $(k_1, ..., k_N)$, is defined via the expected value 
of the adjacency matrix: $\epsilon^{c}_{ij} = k_i k_j/(\langle k\rangle N)$. This 
value leads to $S_c = \langle k\rangle N[\ln(\langle k\rangle N) + 1] - 2N\langle k \ln k\rangle$, where 
$\langle\cdot\rangle \equiv N^{-1}\sum_i(\cdot)$ stands for an average over nodes. 
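
The closed-form entropies of the two ensembles can be checked against a direct numerical sum of the sparse approximation. The sketch below is only an illustration: the function names and the random degree sequence are hypothetical, and the counting conventions are inferred from the formulas rather than stated in the text (the ER expression corresponds to counting each unordered pair once, whereas the configuration expression, as written, matches the sum over all ordered pairs including i = j).

```python
import numpy as np

def sparse_entropy(eps):
    """Sparse approximation: S ~ -sum_ij eps_ij [ln(eps_ij) - 1], over all (i, j)."""
    eps = np.asarray(eps, dtype=float)
    return -(eps * (np.log(eps) - 1.0)).sum()

def entropy_er(mean_k, n):
    """Closed form for the ER ensemble: -(1/2) <k> N [ln(<k>/N) - 1]."""
    return -0.5 * mean_k * n * (np.log(mean_k / n) - 1.0)

def entropy_config(k):
    """Closed form for the configuration ensemble: <k>N [ln(<k>N) + 1] - 2N <k ln k>."""
    k = np.asarray(k, dtype=float)
    n, mean_k = k.size, k.mean()
    return mean_k * n * (np.log(mean_k * n) + 1.0) - 2.0 * n * np.mean(k * np.log(k))

# Hypothetical degree sequence, used only to compare the two routes numerically.
rng = np.random.default_rng(0)
k = rng.integers(1, 20, size=500).astype(float)
eps_c = np.outer(k, k) / k.sum()          # eps^c_ij = k_i k_j / (<k> N)

print(sparse_entropy(eps_c), entropy_config(k))
```

For the ER case, the same comparison uses a constant matrix with entries `mean_k / n` and half of `sparse_entropy`, since each undirected pair is counted once.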

FIG. 1: (Color online) Evolution of the Internet at the AS 
level. Empty (blue) squares and circles: entropy per node of 
randomized networks in the fully random and in the configuration 
ensembles, as obtained by Bianconi (hence the "B" 
superscript). Filled (red) triangles and diamonds: Shannon 
entropy for an ER network and a scale-free one with 
$\gamma = 2.3$, respectively. 

Fig. 1 displays the entropy per node obtained in [11] 
for the first two levels of approximation (ensembles) to 
the Internet at the AS level, first taking into account 
only the numbers of nodes $N$ and edges $L = \frac{1}{2}\langle k\rangle N$, and 
then also the degree sequence. Alongside these, we plot 
the Shannon entropy both for an ER random network 
(which coincides exactly with Bianconi's expression), and 
for a scale-free network with $\gamma = 2.3$ (the slight disparity 
arising from this exponent's changing a little with time). 

We shall now go on to analyse the effect of degree-degree 
correlations on the entropy. In the configuration 
ensemble, the expected value of the mean degree of the 
neighbors of a given node is $k_{nn,i} = k_i^{-1}\sum_j \epsilon^{c}_{ij} k_j = \langle k^2\rangle/\langle k\rangle$, 
which is independent of $k_i$. However, as mentioned 
above, real networks often display degree-degree 
correlations, with the result that $k_{nn,i} = k_{nn}(k_i)$. If 
$k_{nn}(k)$ increases (decreases) with $k$, the network is 
assortative (disassortative). A measure of this phenomenon 
is Pearson's coefficient applied to the edges 0-11]: 
$r = ([k_l k'_l] - [k_l]^2)/([k_l^2] - [k_l]^2)$, where $k_l$ and $k'_l$ are the 
degrees of each of the two nodes belonging to edge $l$, and 
$[\cdot] \equiv (\langle k\rangle N)^{-1}\sum_l(\cdot)$ is an average over edges. Writing 
$\sum_l(\cdot) = \sum_{ij}\epsilon_{ij}(\cdot)$, $r$ can be expressed as 

$$r = \frac{\langle k\rangle\langle k^2 k_{nn}(k)\rangle - \langle k^2\rangle^2}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}. \qquad (2)$$
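
The equivalence between the edge-average definition of r and Eq. (2) can be checked numerically. The following is a sketch under the assumption that the degrees are the row sums of a symmetric expected adjacency matrix; the function names are illustrative.

```python
import numpy as np

def pearson_r_from_eps(eps, k):
    """r from edge averages, replacing sum_l(.) by sum_ij eps_ij (.)."""
    eps = np.asarray(eps, dtype=float)
    k = np.asarray(k, dtype=float)
    w = eps / eps.sum()                    # normalized weight of each pair (i, j)
    kk = (w * np.outer(k, k)).sum()        # [k_l k'_l]
    k1 = (w.sum(axis=1) * k).sum()         # [k_l]
    k2 = (w.sum(axis=1) * k**2).sum()      # [k_l^2]
    return (kk - k1**2) / (k2 - k1**2)

def pearson_r_eq2(k, knn):
    """Eq. (2): r from node averages of k and of knn evaluated at each node."""
    k = np.asarray(k, dtype=float)
    knn = np.asarray(knn, dtype=float)
    m1, m2, m3 = k.mean(), (k**2).mean(), (k**3).mean()
    return (m1 * (k**2 * knn).mean() - m2**2) / (m1 * m3 - m2**2)

# Configuration ensemble: knn is constant, so r should vanish.
k = np.array([1.0, 2.0, 2.0, 3.0, 5.0, 8.0])
eps_c = np.outer(k, k) / k.sum()
knn_c = eps_c @ k / k                      # equals <k^2>/<k> for every node
print(pearson_r_from_eps(eps_c, k), pearson_r_eq2(k, knn_c))
```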

FIG. 2: (Color online) Shannon entropy of correlated scale-free 
networks against parameter $\beta$ (left panel) and against 
Pearson's coefficient $r$ (right panel), for various values of $\gamma$ 
(increasing from bottom to top), $\langle k\rangle = 10$, N = 10^. 

The ensemble of all networks with a given degree 
sequence $(k_1, ..., k_N)$ contains a subset for all members of 
which $k_{nn}(k)$ is constant (the configuration ensemble), 
but also subsets displaying other functions $k_{nn}(k)$. We 
can identify each one of these subsets (regions of phase 
space) with an expected adjacency matrix $\epsilon$ which 
simultaneously satisfies the following conditions: i) $\sum_j k_j\epsilon_{ij} = k_i k_{nn}(k_i)$, $\forall i$, 
and ii) $\sum_j \epsilon_{ij} = k_i$, $\forall i$ (for consistency). 
An ansatz which fulfills these requirements is any matrix 
of the form 



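$$\epsilon_{ij} = \frac{k_i k_j}{\langle k\rangle N} + \int d\nu\, \frac{f(\nu)}{N}\left[\frac{(k_i k_j)^{\nu}}{\langle k^{\nu}\rangle} - k_i^{\nu} - k_j^{\nu} + \langle k^{\nu}\rangle\right], \qquad (3)$$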

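where $\nu \in \mathbb{R}$ and the function $f(\nu)$ is in general arbitrary, 
although depending on the degree sequence it shall here 
be restricted to values which maintain $\epsilon_{ij} \in [0, 1]$, $\forall i, j$. 
This ansatz yields 

$$k_{nn}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + \int d\nu\, f(\nu)\,\sigma_{\nu+1}\left[\frac{k^{\nu-1}}{\langle k^{\nu}\rangle} - \frac{1}{k}\right] \qquad (4)$$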



(the first term being the result for the configuration 
ensemble), where $\sigma_{b+1} \equiv \langle k^{b+1}\rangle - \langle k\rangle\langle k^{b}\rangle$. In practice, 
one could adjust Eq. (4) to fit any given function $k_{nn}(k)$ 
and then wire up a network with the desired correlations: 
it suffices to throw random numbers according to Eq. (3) 
with $f(\nu)$ as obtained from the fit to Eq. (4) fli| . 
To prove the uniqueness of a matrix $\epsilon$ obtained in this 
way (i.e., that it is the only one compatible with a 
given $k_{nn}(k)$) assume that there exists another valid 
matrix $\epsilon' \neq \epsilon$. Writing $\epsilon'_{ij} - \epsilon_{ij} \equiv h(k_i, k_j) = h_{ij}$, then 
i) implies that $\sum_j k_j h_{ij} = 0$, $\forall i$, while ii) means that 
$\sum_j h_{ij} = 0$, $\forall i$. It follows that $h_{ij} = 0$, $\forall i, j$. 
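
"Throwing random numbers" according to an expected adjacency matrix can be sketched as one independent Bernoulli trial per pair of nodes. This is an illustrative reading, not necessarily the procedure the authors used; the function name is hypothetical and self-loops are excluded by assumption.

```python
import numpy as np

def sample_undirected(eps, rng):
    """Draw a symmetric 0/1 adjacency matrix with P(a_ij = 1) = eps_ij for i < j.

    Assumption of this sketch: no self-loops, so the diagonal is always zero.
    """
    eps = np.asarray(eps, dtype=float)
    n = eps.shape[0]
    # one uniform random number per ordered pair; keep only the pairs with i < j
    upper = np.triu(rng.random((n, n)) < eps, k=1)
    return (upper | upper.T).astype(int)

# Usage with the configuration-ensemble matrix of a small hypothetical sequence.
rng = np.random.default_rng(42)
k = np.array([1.0, 2.0, 2.0, 2.0, 1.0])
a = sample_undirected(np.outer(k, k) / k.sum(), rng)
print(a)
```

Any matrix with entries in [0, 1] can be passed in place of the configuration-ensemble one, including a correlated matrix built from Eq. (3).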



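In many empirical networks, $k_{nn}(k)$ has the form 
$k_{nn}(k) = A + Bk^{\beta}$, with $A, B > 0$ [ill - the mixing 
being assortative (disassortative) if $\beta$ is positive 
(negative). Such a case is fitted by Eq. (4) if 
$f(\nu) = C[\delta(\nu - \beta - 1)\sigma_2/\sigma_{\beta+2} - \delta(\nu - 1)]$, with $C$ a positive 
constant, since this choice yields 

$$k_{nn}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + C\sigma_2\left[\frac{k^{\beta}}{\langle k^{\beta+1}\rangle} - \frac{1}{\langle k\rangle}\right]. \qquad (5)$$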
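
With the delta-function choice of f(ν), the integral in Eq. (3) collapses to two terms, so the expected adjacency matrix can be built explicitly and checked against conditions i) and ii) and against Eq. (5). The sketch below assumes that reading of Eqs. (3)-(5); the helper names are illustrative, and β = -1 is excluded because σ_{β+2} then vanishes. It does not check that the entries stay within [0, 1].

```python
import numpy as np

def sigma(k, b):
    """sigma_b = <k^b> - <k><k^(b-1)>."""
    return np.mean(k**b) - np.mean(k) * np.mean(k**(b - 1))

def bracket(k, nu):
    """Bracketed term of Eq. (3) for a single exponent nu."""
    kn, m = k**nu, np.mean(k**nu)
    return np.outer(kn, kn) / m - kn[:, None] - kn[None, :] + m

def eps_powerlaw(k, beta, C=1.0):
    """Eq. (3) with f(nu) = C[delta(nu-beta-1) sigma_2/sigma_(beta+2) - delta(nu-1)]."""
    k = np.asarray(k, dtype=float)
    n = k.size
    eps_c = np.outer(k, k) / (k.mean() * n)       # configuration-ensemble term
    corr = sigma(k, 2) / sigma(k, beta + 2) * bracket(k, beta + 1) - bracket(k, 1)
    return eps_c + (C / n) * corr

def knn_eq5(k, beta, C=1.0):
    """Eq. (5): mean nearest-neighbor degree as a function of k."""
    k = np.asarray(k, dtype=float)
    return np.mean(k**2) / k.mean() + C * sigma(k, 2) * (
        k**beta / np.mean(k**(beta + 1)) - 1.0 / k.mean()
    )

# Hypothetical degree sequence and parameters, for illustration only.
rng = np.random.default_rng(0)
k = rng.integers(2, 30, size=200).astype(float)
eps = eps_powerlaw(k, beta=-0.3, C=0.5)
print(np.allclose(eps.sum(axis=1), k))            # condition ii)
print(np.allclose(eps @ k / k, knn_eq5(k, -0.3, C=0.5)))  # condition i) with Eq. (5)
```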



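After plugging Eq. (5) into Eq. (2), one obtains: 

$$r = \frac{C\sigma_2}{\langle k^{\beta+1}\rangle}\left(\frac{\langle k\rangle\langle k^{\beta+2}\rangle - \langle k^2\rangle\langle k^{\beta+1}\rangle}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}\right). \qquad (6)$$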



Inserting Eq. (3) in Eq. (1), we can calculate the entropy 
of correlated networks as a function of $\beta$ and $C$ - 
or, by using Eq. (6), as a function of $r$. Particularizing 
for scale-free networks, then given $\langle k\rangle$, $N$ and $\gamma$, there 
is always a certain combination of parameters $\beta$ and $C$ 
which maximizes the entropy; we shall call these $\beta^*$ and 
$C^*$. For $\gamma < 5/2$ this point corresponds to $C^* = 1$. For 
higher $\gamma$, the entropy can be slightly higher for larger 
$C$. However, for these values of $\gamma$, the assortativity $r$ 
of the point of maximum entropy obtained with $C = 1$ 
differs very little from the one corresponding to $\beta^*$ and 
$C^*$ (data not shown). Therefore, for the sake of clarity 
but with very little loss of accuracy, in the following we 
shall generically set $C = 1$ and vary only $\beta$ in our search 
for the level of assortativity, $r^*$, that maximizes the 
entropy given $\langle k\rangle$, $N$ and $\gamma$. Note that $C = 1$ corresponds 
to removing the linear term, proportional to $k_i k_j$, in Eq. (3), 
and leaving the leading non-linearity, $(k_i k_j)^{\beta+1}$, as 
the dominant one. 
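
The search for the entropy-maximizing correlations can be sketched as a scan over β with C = 1. Everything specific here is an assumption made for illustration: the degree sampler, the bounds `kmin` and `kmax`, the grid of β values, and the rule of skipping any β for which some entry of the matrix leaves [0, 1]. The sketch is not meant to reproduce the figures.

```python
import numpy as np

def scale_free_degrees(n, gamma, kmin, kmax, rng):
    """Hypothetical sampler: n degrees drawn from p(k) ~ k^(-gamma) on [kmin, kmax]."""
    ks = np.arange(kmin, kmax + 1)
    p = ks.astype(float) ** -gamma
    return rng.choice(ks, size=n, p=p / p.sum()).astype(float)

def eps_c1(k, beta):
    """Eq. (3) with the delta-function f(nu) and C = 1 (the k_i k_j term cancels)."""
    n, b = k.size, beta + 1.0
    kb, mb = k**b, np.mean(k**b)
    s2 = np.mean(k**2) - k.mean() ** 2                 # sigma_2
    sb2 = np.mean(k ** (b + 1)) - k.mean() * mb        # sigma_(beta+2)
    nonlin = np.outer(kb, kb) / mb - kb[:, None] - kb[None, :] + mb
    lin = k[:, None] + k[None, :] - k.mean()
    return (lin + s2 / sb2 * nonlin) / n

def entropy(eps):
    """Eq. (1) summed over pairs i < j (undirected network)."""
    e = eps[np.triu_indices_from(eps, k=1)]
    e = e[(e > 0) & (e < 1)]           # entries at exactly 0 or 1 contribute nothing
    return -(e * np.log(e) + (1 - e) * np.log1p(-e)).sum()

def r_eq6(k, beta, C=1.0):
    """Eq. (6): Pearson coefficient for the power-law form of knn(k)."""
    m = lambda p: np.mean(k**p)
    s2 = m(2) - m(1) ** 2
    num = m(1) * m(beta + 2) - m(2) * m(beta + 1)
    return C * s2 / m(beta + 1) * num / (m(1) * m(3) - m(2) ** 2)

def max_entropy_beta(k, betas):
    """Return the beta on the grid with the largest entropy among valid matrices."""
    best_s, best_b = -np.inf, None
    for b in betas:
        eps = eps_c1(k, b)
        if eps.min() < 0 or eps.max() > 1:   # outside the allowed range: skip
            continue
        s = entropy(eps)
        if s > best_s:
            best_s, best_b = s, b
    return best_b

rng = np.random.default_rng(0)
k = scale_free_degrees(300, gamma=2.5, kmin=2, kmax=20, rng=rng)
beta_star = max_entropy_beta(k, np.linspace(-0.9, 0.9, 37))
print(beta_star, r_eq6(k, beta_star))
```

The grid avoids β = -1, where σ_{β+2} vanishes, and includes β = 0, which recovers the configuration ensemble and r = 0.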

FIG. 3: (Color online) Lines from top to bottom: $r$ at which 
the entropy is maximized, $r^*$, against $\gamma$ for random scale-free 
networks with mean degrees $\langle k\rangle = \frac{1}{2}$, 1, 2 and 4 times $k_0 = 5.981$, 
and $N = N_0 = 10697$ nodes ($k_0$ and $N_0$ correspond to 
the values for the Internet at the AS level in 2001 0, which 
had $r = r_0 = -0.189$). Symbols are the values obtained 
in 01 as those expected solely due to the one-edge-per-pair 
restriction (with $k_0$, $N_0$ and $\gamma = 2.1$, 2.3 and 2.5). Inset: $r^*$ 
against $N$ for networks with fixed $\langle k\rangle/N$ (same values as the 
main panel) and $\gamma = 2.5$; the arrow indicates $N = N_0$. 



Fig. 2 displays the entropy curves for various scale-free 
networks, both as functions of $\beta$ and of $r$: depending 
on the value of $\gamma$, the point of maximum entropy can 
be either assortative or disassortative. This can be seen 
more clearly in Fig. 3, where $r^*$ is plotted against $\gamma$ 
for scale-free networks with various mean degrees $\langle k\rangle$. 
The values obtained by Park and Newman Q as those 
resulting from the one-edge-per-pair restriction are also 
shown for comparison: notice that whereas this effect 
alone cannot account for the Internet's correlations for 
any $\gamma$, entropy considerations would suffice if $\gamma \simeq 2.1$. 
As shown in the inset, the results are robust in the large 
system-size limit (although see [16]). 

Since most networks observed in the real world are 
highly heterogeneous, with exponents in the range $\gamma \in (2, 3)$, 
it is to be expected that these should display a 
certain disassortativity - the more so the lower $\gamma$ and 
the higher $\langle k\rangle$. In Fig. 4 we test this prediction on a 
sample of empirical, scale-free networks quoted in Newman's 
review (p. 182). For each case, we found the 
value of $r$ that maximizes $S$ according to Eq. (1), after 
inserting Eq. (3) with the quoted values of $\langle k\rangle$, $N$ 
and $\gamma$. In this way, we obtained the expected assortativity 
for six networks, representing: a peer-to-peer (P2P) 
network, metabolic reactions, the nd.edu domain, actor 
collaborations, protein interactions, and the Internet (see 
[1] and references therein). For the metabolic, Web domain 
and protein networks, the values predicted are in excellent 
agreement with the measured ones; therefore, no 
specific anticorrelating mechanisms need to be invoked 
to account for their disassortativity. In the other three 
cases, however, the predictions are not accurate, so there 
must be additional correlating mechanisms at work. Indeed, 
it is known that small routers tend to connect to 
large ones [15], so one would expect the Internet to be 
more disassortative than predicted, as is the case - 
an effect that is less pronounced but still detectable in the 
more egalitarian P2P network. Finally, as is typical of 
social networks, the actor graph is significantly more 
assortative than predicted, probably due to the homophily 
mechanism whereby highly connected, big-name actors 
tend to work together [4]. 

FIG. 4: (Color online) Level of assortativity that maximizes 
the entropy, $r^*$, for various real-world, scale-free networks, 
as predicted theoretically by Eq. (1) (solid symbols) and as 
directly measured (empty symbols), against exponent $\gamma$. 



In summary, we have shown how the ensemble of networks 
with a given degree sequence can be partitioned 
into regions of equally correlated networks and found, 
using an information-theory approach, that the largest 
(maximum entropy) region, for the case of scale-free 
networks, usually displays a certain disassortativity. Therefore, 
in the absence of knowledge regarding the specific 
evolutionary forces at work, this should be considered the 
most likely state. Given the accuracy with which our 
approach can predict the degree of assortativity of certain 
empirical networks with no a priori information thereon, 
we suggest this as a neutral model to decide whether or 
not particular experimental data require specific mechanisms 
to account for observed degree-degree correlations. 